Scheduling in Switches with Small Internal Buffers: Extended Version

Authors

  • Nikos Chrysos
  • Manolis Katevenis
Abstract

Unbuffered crossbars or switching fabrics contain no internal buffers, and function using only input (VOQ) and possibly output queues. Schedulers for such switches are complex, and introduce increased delay at medium loads, because they have to admit at most one cell per input and per output during each time-slot. Buffered crossbars, on the other hand, contain sufficient internal buffering (N² buffers) to allow independent schedulers to concurrently forward packets to the same output from any number of inputs. These architectures represent the two extremes in a range of solutions, which we examine here; although intermediate points in this range are of reduced practical interest for crossbars, they are nevertheless quite interesting for switching fabrics, and they may be of interest for optical switches. We find that tolerating two cells per output per time-slot, using small buffers inside the switch or fabric, suffices for independent and efficient scheduling. First, we introduce a novel “request-grant” credit protocol, enabling N inputs to share a small switch buffer. Then, we apply this protocol to a switch with N such buffers, one per output, and we consider the resulting scheduling problem. Interestingly, this problem resembles unbuffered crossbar scheduling, but it is much simpler because it comprises independent, single-resource schedulers that can be pipelined. We show that individual buffer sizes need not grow with either switch size or propagation delay. Through simulations, we study performance as a function of the number of cells allowed per output per time-slot. For one cell, the switch performs very close to the iSLIP unbuffered crossbar with one iteration. For more cells, performance improves quickly; for 12 cells, packet delay under (smooth) uniform load is practically as low as ideal output queueing. Under unbalanced load, throughput is superior to buffered crossbars, due to better buffer sharing.

‡ The authors are also with the Department of Computer Science, University of Crete, Heraklion, Crete, Greece.

1. INTRODUCTION

Networks need fast and low-cost packet switches to keep pace with the increase in communication demand. Switches employ ingress and egress linecards, which usually contain sizable buffer memories, and a core, which is a crossbar or a switching fabric. Packet switch architectures belong to two principal categories, depending on their core: bufferless or buffered. Crossbars were bufferless, but are now evolving to architectures with buffers per crosspoint, owing to advances in IC technology that allow increased on-chip memory; analogous trends exist for fabrics made of multiple smaller switching elements. This paper studies the spectrum of intermediate solutions between the two extremes of bufferless and buffered crossbars. Our study provides indications that most of the advantages of buffered architectures (simple and efficient, distributed, pipelined scheduling) can be achieved with considerably less total buffer space compared to what buffered crossbars currently employ.

Fig. 1. Placement of buffers in a crossbar. Starting from the left, we have a bufferless crossbar (zero buffers); then a buffered crossbar containing N² buffers, i.e., N per output; and last, on the right, a system with fewer than N buffers per output. Two cell buffers per output are shown in the figure but, in general, the number of cells that fit in the aggregate output buffer can be either a constant independent of N or a sublinearly growing function of N. In this paper we consider the latter type of switches (or fabrics), i.e., buffered, with fewer than N buffers per output. In the figure of the bufferless crossbar, the red cell, shown with a dashed line, cannot proceed because its output is occupied by another cell; in buffered switches, however, the cell can proceed, as it can be stored inside an output buffer.
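To make the idea of N inputs sharing one small per-output buffer more concrete, the following is a minimal Python sketch of a request-grant credit scheduler for a single output. It is an illustration under stated assumptions, not the authors' design: the class name `OutputCreditScheduler`, its methods, and the round-robin choice among requesting inputs are all hypothetical. The only property taken from the text is that an input may send a cell only after receiving a grant, and a grant is issued only while the shared buffer has free space.

```python
class OutputCreditScheduler:
    """Hypothetical per-output scheduler guarding a shared buffer of a few cells."""

    def __init__(self, num_inputs: int, buffer_cells: int) -> None:
        self.pending = [0] * num_inputs  # outstanding requests per input
        self.capacity = buffer_cells     # total cell slots in the shared buffer
        self.credits = buffer_cells      # currently free cell slots
        self.next_input = 0              # round-robin pointer (an assumption)

    def request(self, i: int) -> None:
        """Input i asks for permission to send one cell to this output."""
        self.pending[i] += 1

    def grant(self):
        """Issue at most one grant; return the granted input, or None."""
        if self.credits == 0:
            return None  # buffer space fully reserved: nobody may send
        n = len(self.pending)
        for k in range(n):
            i = (self.next_input + k) % n
            if self.pending[i] > 0:
                self.pending[i] -= 1
                self.credits -= 1        # reserve one buffer slot for the cell
                self.next_input = (i + 1) % n
                return i
        return None  # credits available, but no input is requesting

    def cell_departed(self) -> None:
        """A cell left the buffer on the output link; its slot is free again."""
        if self.credits >= self.capacity:
            raise RuntimeError("credit returned for a slot that was never reserved")
        self.credits += 1


# Usage: four inputs share a two-cell buffer in front of one output.
sched = OutputCreditScheduler(num_inputs=4, buffer_cells=2)
for i in (0, 1, 2):
    sched.request(i)
granted = [sched.grant(), sched.grant(), sched.grant()]  # third grant is refused
sched.cell_departed()                                    # one slot drains
granted.append(sched.grant())
print(granted)  # [0, 1, None, 2]
```

Because each such scheduler manages a single resource (one output buffer), N of them can run independently, one per output, which is the structural difference from a central crossbar scheduler.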
With an unbuffered core, output conflicts must be avoided before packets enter the core: in each time-slot¹, only a single cell in the crossbar can use a specific crossbar output, and only one cell in the crossbar can use a specific crossbar input; in graph-theoretic terms, crossbar scheduling is equivalent to bipartite (input-to-output) graph matching. This problem requires a central scheduler to coordinate the set of input/output pairs (flows, or connections) that will be in the crossbar in each time-slot [1] [2] [3]; this is a complex task that can limit the switch packet rate. The heuristic algorithms in use today work well only when internal speedup is used to compensate for their scheduling inefficiencies [4]. Because these algorithms operate only on fixed-size units, additional speedup is needed when external packets have variable size, to compensate for segmentation padding. Buffered architectures ease scheduling by allowing conflict

¹ A time-slot is equal to one cell time, that is, the time it takes to transmit a cell at rate λ.
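The matching constraint for a bufferless crossbar (at most one cell per input and per output in each time-slot) can be illustrated with a small Python sketch. The greedy maximal matching below is only a stand-in chosen for brevity; it is not iSLIP and not an algorithm from the paper, and the function names and the `voq` occupancy matrix are assumptions made for this example.

```python
from typing import List, Tuple


def greedy_crossbar_match(voq: List[List[int]]) -> List[Tuple[int, int]]:
    """Pick input/output pairs for one time-slot.

    voq[i][j] is the number of cells queued at input i for output j.
    Each input and each output appears in at most one selected pair.
    """
    used_outputs = set()
    match = []
    for i, row in enumerate(voq):
        for j, queued in enumerate(row):
            if queued > 0 and j not in used_outputs:
                match.append((i, j))  # input i sends one cell to output j
                used_outputs.add(j)
                break                 # an input sends at most one cell per slot
    return match


def is_valid_match(match: List[Tuple[int, int]]) -> bool:
    """Check the bufferless-crossbar constraint on a set of pairs."""
    inputs = [i for i, _ in match]
    outputs = [j for _, j in match]
    return len(set(inputs)) == len(inputs) and len(set(outputs)) == len(outputs)


# Input 0 has cells for outputs 0 and 1, input 1 only for output 0,
# input 2 only for output 2.
voq = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
match = greedy_crossbar_match(voq)
print(match)                  # [(0, 0), (2, 2)]
print(is_valid_match(match))  # True
```

In this example the greedy choice leaves input 1 idle, although the pairs (0, 1), (1, 0), (2, 2) would serve all three inputs. Gaps of this kind between a simple heuristic and the best possible matching are one example of the scheduling inefficiency that internal speedup is used to compensate for.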

Similar resources

Scheduling Algorithms for CIOQ Switches

This proposal deals with the design of scheduling algorithms for Combined Input and Output Queued (CIOQ) switches. For crossbar based switches, we demonstrate the poor performance of commonly used scheduling algorithms under overload traffic conditions using targeted stress tests and present ideas to develop robust, stress resistant versions of these algorithms which are still simple enough to ...

Multicast Traffic Scheduling Based On High-Speed Crossbar Switches

The tremendous growth of the Internet coupled with newly emerging applications has created a vital need for multicast traffic support by backbone routers and ATM switches. In this paper, we first introduce the multicast traffic scheduling problem. We focus our study on the multicast traffic scheduling in crossbar based input queued (IQ) switches. Due to the centralized scheduling complexity in ...

A Hybrid Approach for Fuzzy Just-In-Time Flow Shop Scheduling with Limited Buffers and Deteriorating Jobs

This paper investigates the problem of just-in-time permutation flow shop scheduling with limited buffers and linear job deterioration in an uncertain environment. The fuzzy set theory is applied to describe this situation. A novel mixed-integer nonlinear program is presented to minimize the weighted sum of fuzzy earliness and tardiness penalties. Due to the computational complexities, the prop...

The Least Choice First (LCF) Scheduling Method for High-speed Network Switches

We describe a novel method for scheduling high-speed network switches. The targeted architecture is an input-buffered switch with a non-blocking switch fabric. The input buffers are organized as virtual output queues to avoid head-of-line blocking. The task of the scheduler is to decide when the input ports can forward packets from the virtual output queues to the corresponding output ports. Ou...

Weighted Fairness in Buffered Crossbar Scheduling

The crossbar is the most popular packet switch architecture. By adding small buffers at the crosspoints, important advantages can be obtained: (1) Crossbar scheduling is simplified. (2) High throughput is achievable. (3) Weighted scheduling becomes feasible. In this paper we study the fairness properties of a buffered crossbar with weighted fair schedulers. We show by means of simulation that, ...

Publication date: 2005